(54) Programmable logic devices with function-specific blocks



(57) A programmable logic integrated circuit device
has at least one function-specific circuit block (e.g., a
parallel multiplier, a parallel barrel shifter, a parallel
arithmetic logic unit, etc.) in addition to the usual multiple
regions of programmable logic and the usual programmable
interconnection circuit resources. To reduce the
impact of use of the function-specific block ("FSB") on
the general purpose interconnection resources of the
device, inputs and/or outputs of the FSB may be coupled
relatively directly to a subset of the logic regions. In addition
to conserving general purpose interconnect, resources
of the logic regions to which the FSB is connected
can be used by the FSB to reduce the amount
of circuitry that must be dedicated to the FSB. If the FSB
is a multiplier, additional features include facilitating accumulation
of successive multiplier outputs (using either
addition or subtraction and with sign extension if desired)
and/or arithmetically combining the outputs of
multiple multipliers.
Description

[0001] This application claims the benefit of U.S. provisional
patent application No. 60/233,389, filed September
18, 2000, which is hereby incorporated by reference
herein in its entirety.

Background of the Invention 

[0002] This invention relates to programmable logic
integrated circuit devices, and more particularly to including
function-specific blocks such as multipliers,
arithmetic logic units, barrel shifters, and/or the like in
programmable logic devices.

[0003] Programmable logic devices ("PLDs") are well
known, as is shown, for example, by Jefferson et al. U.S.
patent 6,215,326 and Ngai et al. U.S. patent application
No. 09/516,921, filed March 2, 2000. PLDs typically
include many regions of programmable logic that are
interconnectable in any of many different ways by programmable
interconnection resources. Each logic region
is programmable to perform any of several logic
functions on input signals applied to that region from the
interconnection resources. As a result of the logic function(s)
it performs, each logic region produces one or
more output signals that are applied to the interconnection
resources. The interconnection resources typically
include drivers, interconnection conductors, and programmable
switches for selectively making connections
between various interconnection conductors. The inter- 
connection resources can generally be used to connect 
any logic region output to any logic region input; al- 
though to avoid having to devote a disproportionately 
large fraction of the device to interconnection resources, 
it is usually the case that only a subset of all possible 
interconnections can be made in any given programmed 
configuration of the PLD. Indeed, this last point is very 
important in the design of PLDs because interconnec- 
tion resources must always be somewhat limited in 
PLDs having large logic capacity, and interconnection 
arrangements must therefore be provided that are flex- 
ible, efficient, and of adequate capacity without displac- 
ing excessive amounts of other resources such as logic. 
[0004] Although only logic regions are mentioned 
above, it should also be noted that many PLDs also now 
include regions of memory that can be used as random 
access memory ("RAM"), read-only memory ("ROM"), 
content addressable memory ("CAM"), product term ("p- 
term") logic, etc. 

[0005] As the capacity and speed of PLDs have increased,
there has been increasing interest in using
them for signal or data processing tasks that may involve
relatively large amounts of parallel information
and that may require relatively complex manipulation,
combination, and recombination of that information.
Large numbers of signals in parallel consume a correspondingly
large amount of interconnection resources;
and each time that information (or another combination
or recombination that includes that information) must be
routed within the device, another similar large amount
of the interconnection resources is consumed. Improved
PLD architectures are needed to better address
these issues.

Summary of the Invention 

[0006] A PLD in accordance with this invention in- 
cludes a plurality of regions of programmable logic cir- 
cuitry, general purpose interconnection circuitry that is 
programmably configurable to allow outputs of substan- 
tially any of the regions to be applied to inputs of sub- 
stantially any of the regions, function-specific circuitry, 
and routing circuitry that is programmably configurable 
to route outputs of the function-specific circuitry to only
a subset of the regions. 

[0007] A function-specific block ("FSB") typically has 
a plurality of parallel inputs and a plurality of parallel out- 
puts. An FSB is at least partly hard-wired to perform a 
particular task or tasks on its inputs to produce its outputs.
The task(s) performed by an FSB may be wholly or
partly, programmably or dynamically, selectable. Examples
of FSBs include parallel multipliers, parallel arithmetic
logic units ("ALUs"), barrel shifters, and the like.
[0008] In order to reduce the impact of including FSBs
on the interconnection resources of the PLD, any or all
of several techniques respecting the interconnection resources
may be used in accordance with this invention.
One technique is to derive inputs for the FSB from interconnection
resources that are already fairly local (i.e.,
close) to the inputs of other resources such as logic
regions (or memory regions if memory regions are included
(although borrowing inputs from logic is presently
preferred)). In this way the FSB effectively shares substantial
amounts of input routing resources with those
other (logic/memory/etc.) resources. A smaller fraction
of the overall interconnection resources must be dedicated
to providing FSB inputs, and the impact on use of
the more global (as opposed to the local) interconnection
resources is especially reduced. (Global interconnection
resources include relatively long interconnection
conductors, in contrast to the relatively short conductors
that can be used for more local interconnections.
Accordingly, it is "more expensive" to use a global
interconnection conductor than a local interconnection
conductor. Also, global interconnection conductors tend
to be slow and to require drive by power-consuming drivers,
whereas local conductors tend to be faster and may
not require additional drivers.) Sharing an interconnection
resource between an FSB input and another logic/memory/etc.
resource input may reduce or even sacrifice
the usability of the other resource when the PLD is
configured to use the FSB, but that can be preferable to
having to provide more interconnection resources that
are dedicated to providing FSB inputs.
[0009] Another technique that can be used to reduce
the impact of an FSB on the interconnection resources
of a PLD is to use relatively local interconnection resources
for the outputs of the FSB. These local resources
can be used to supply the FSB outputs to the inputs
(or other relatively local interconnection resources lead- 
ing to the inputs) of particular subsets of other resources 
such as logic regions on the PLD. This avoids the need 
for drivers and/or more global interconnection conduc- 
tors dedicated to the FSB outputs. If FSB output driving 
by drivers is needed, the output drivers of the immedi- 
ately above-mentioned logic regions can be used. Sim- 
ilarly, if the FSB outputs need registering, the registers 
of these logic regions can be used. And these logic re- 
gions can even be used to at least begin further logical 
and/or arithmetic manipulation of the FSB outputs. Once 
again, this effective sharing of certain FSB output func- 
tions with logic regions may reduce or sacrifice the use- 
fulness of those logic regions for other purposes when 
the FSB is being used, but that can be preferable to having
to provide more dedicated interconnection resources
to support the FSB.

[0010] Other aspects of the invention may be used to 
facilitate providing arithmetic accumulation of succes- 
sive FSB (especially multiplier) outputs (using either ad- 
dition or subtraction), addition or other logical combina- 
tion of multiple concurrent FSB (especially multiplier) 
outputs, sign extension of FSB (especially multiplier) 
outputs, registration of FSB inputs and/or outputs, etc. 
[0011] Further features of the invention, its nature and
various advantages will be more apparent from the accompanying
drawings and the following detailed description.

Brief Description of the Drawings 

[0012] FIG. 1 is a simplified schematic block diagram 
of representative portions of an illustrative embodiment 
of a programmable logic device constructed in accord- 
ance with the invention. 

[0013] FIG. 1A is a more detailed but still simplified 
schematic block diagram illustrating a particular feature 
of the FIG. 1 device. 

[0014] FIG. 2 is a simplified schematic block diagram
showing an illustrative use of portions of the FIG. 1 circuitry
in accordance with the invention.

[0015] FIG. 3 is a simplified schematic block diagram
showing another illustrative use of portions of the FIG. 1
circuitry in accordance with the invention.

[0016] FIG. 4 is a simplified schematic block diagram
showing still another illustrative use of portions of the
FIG. 1 circuitry in accordance with the invention.

[0017] FIG. 5 is a simplified schematic block diagram
showing yet another illustrative use of portions of the
FIG. 1 circuitry in accordance with the invention.

[0018] FIG. 6 is a simplified schematic block diagram
showing still another illustrative use of portions of the
FIG. 1 circuitry in accordance with the invention.

[0019] FIG. 7 is a simplified block diagram of prior art
circuitry that can be readily implemented in circuitry of
the type shown in FIG. 1 in accordance with the invention.

[0020] FIG. 8 is a simplified block diagram of other
prior art circuitry that can be readily implemented in circuitry
of the type shown in FIG. 1 in accordance with the
invention.

[0021] FIG. 9 is a simplified schematic block diagram
showing a representative portion of a particular aspect
of the FIG. 1 circuitry in more detail.

[0022] FIG. 10 is a schematic block diagram illustrating
one possible configuration and use of a portion of
the FIG. 1 circuitry.

[0023] FIG. 11 is generally similar to FIG. 9 but illustrates
a further possible feature of the invention.

[0024] FIG. 12 is a table illustrating a conventional binary
coding scheme that may be used in certain aspects
of the invention.

[0025] FIG. 13 is a table illustrating another conventional
binary coding scheme that may be used in certain
aspects of the invention.

[0026] FIG. 14 is a simplified schematic block diagram
of an illustrative embodiment of circuitry that may be
used in accordance with the invention.

[0027] FIG. 15 is a simplified schematic block diagram
of a representative portion of another illustrative embodiment
of the invention.

[0028] FIG. 16 is a simplified block diagram showing
illustrative use of the FIG. 15 circuitry in accordance with
the invention.

[0029] FIG. 17 is a simplified schematic block diagram
of a representative portion of still another illustrative embodiment
of the invention.

[0030] FIG. 18 is a simplified schematic block diagram
generically illustrative of several possible embodiments
of the invention.

[0031] FIG. 19 is similar to FIG. 18 and generically
shows additional possible features of several illustrative
embodiments of the invention.

[0032] FIG. 20 is a simplified schematic block diagram
of a representative portion of an illustrative embodiment
of the invention.

[0033] FIG. 21 is a simplified schematic block diagram
of a representative portion of an illustrative embodiment
of the invention.

[0034] FIG. 22 is a simplified schematic block diagram
of an illustrative system employing a programmable logic
device in accordance with the invention.

Detailed Description 

[0035] The invention will be at least initially described 
with greatest emphasis on inclusion of parallel multipli- 
ers in PLDs, but other examples of FSBs will also be 
mentioned and described, and from the overall disclosure
it will be apparent to those skilled in the art how the
invention can be applied to any of many different types 
and constructions of FSBs. 

[0036] The illustrative PLD 10 shown in FIG. 1 includes
a two-dimensional array of intersecting rows and
columns of "super-regions" 20 of programmable logic 
and other resources. Each super-region 20 includes a 
plurality of "regions" 30 of programmable logic, a region 
40 of memory, and an FSB 50, which in this example is 
dedicated (i.e., at least partly hard-wired) parallel multi- 
plier circuitry. Each super-region 20 also includes some 
relatively local interconnection resources such as pro- 
grammable logic connectors ("PLCs") 60, the regions of 
interconnection conductors and PLCs labeled 70, logic- 
element-feeding conductors 80, memory-region-feed- 
ing conductors 90, and other conductors 100, 110, 120, 
etc. (Throughout the accompanying drawings many elements
that are actually provided in multiple instances
are represented by just single lines or other single sche- 
matic symbols. Thus, for example, each PLC 60 in FIG. 
1 is actually representative of many instances of such 
PLC circuitry. As another example, each line 110 in FIG. 
1 is actually representative of many parallel conductors 
110.) 

[0037] Each region 30 includes a plurality of "logic el- 
ements" 130. Each logic element (or "logic module" or 
"subregion") 130 is an area of programmable logic that 
is programmable to perform any of several logic tasks 
on signals applied to the logic element to produce one 
or more logic element output signals. For example, each 
logic element 130 may be programmable to perform one
place of binary addition on two input bits and a carry-in 
bit to produce a sum-out bit and a carry-out bit. Each 
logic element 130 also preferably includes register (flip- 
flop) circuitry for selectively registering a logic signal 
within the logic element. 
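
By way of illustration only, the "one place of binary addition" mentioned above can be modeled behaviorally as a full adder. The following Python sketch is an assumption made for explanatory purposes: the function name `full_adder` and the gate-level expressions are not taken from the disclosure, which does not specify how a logic element 130 implements the addition.

```python
def full_adder(a: int, b: int, carry_in: int) -> tuple[int, int]:
    """Behavioral model of one place of binary addition.

    Takes two input bits and a carry-in bit and returns
    (sum_out, carry_out). Hypothetical helper, for illustration only.
    """
    sum_out = a ^ b ^ carry_in
    carry_out = (a & b) | (carry_in & (a ^ b))
    return sum_out, carry_out


if __name__ == "__main__":
    # Print the full truth table of the one-bit adder.
    for a in (0, 1):
        for b in (0, 1):
            for c in (0, 1):
                print(a, b, c, full_adder(a, b, c))
```

Chaining the carry-out of one such element into the carry-in of the next gives a multi-bit adder of the kind a carry chain supports.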

[0038] Conductors 80 apply what may be thought of 
as the primary inputs to each logic element 130 (al- 
though logic elements may also have other inputs). The 
outputs of logic elements 130 are not shown in FIG. 1 
to avoid over-crowding the drawing. However, those 
outputs typically go to local interconnect resources 70 
and other more general-purpose interconnection re- 
sources such as the global interconnect 140 associated 
with the row of super-regions 20 from which that logic 
element output came. There may also be another level 
of horizontal, general purpose interconnect associated 
with each super-region 20 that is not shown in FIG. 1 
(again to avoid over-crowding the drawing). This would 
include conductors that extend across the super-region 
and that are usable for conveying signals between the 
regions 30 and 40 in that super-region. The output sig- 
nals of the logic elements 130 in each super-region 20 
are also typically applied to that level of interconnect, 
and that level of interconnect also typically provides ad- 
ditional inputs to PLCs 60. 

[0039] PLCs 60 (of which there are many for each lo- 
cal interconnect region 70) are programmable (e.g., by 
associated function control elements ("FCEs")) to select 
any of their inputs for output to the associated local interconnect
70. Each local interconnect 70 is programmable
(again by FCEs) to route the signals it receives
to the adjacent logic elements 130 or memory region 40,
or in certain cases to FSB 50. 

[0040] Vertical global interconnection resources 150
are provided for making general purpose interconnections
between the rows of super-regions 20.

[0041] Terms like "super-region", "region", and "logic
element" or the like are used herein only as relative
terms to indicate that relatively small elements may be
grouped together in larger elements or units. These
terms are not intended to always refer to circuitry of any
absolute or fixed size or capacity. And indeed, if a hierarchy
of relative sizes is not relevant in a particular context,
these various terms may be used interchangeably
or as essentially generic to one another. For example,
in the above Background section the term "region" is
used in this generic way.

[0042] Additional consideration of the term "PLC" is
also appropriate at this point. Although thus-far described
as being programmably (and therefore statically
or relatively statically) controlled (e.g., by FCEs), it will
be understood that some or all elements referred to
herein as PLCs may be alternatively controlled in other
ways. For example, a PLC may be controlled by a more
dynamic control signal (e.g., a logic signal on PLD 10
that can have different logic levels at different times during
the post-configuration, "normal" logic operation of
the PLD). Although such dynamic control of a PLC may
mean that the PLC is not, strictly speaking, a "programmable"
logic connector, nevertheless the term "PLC" will
continue to be used as a generic term for all such generally
similar elements, whether statically or dynamically
controlled.
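
The distinction between static and dynamic control of a PLC may be easier to see in a small behavioral model. The Python sketch below treats a PLC as a simple selector of one of its inputs; the names `plc` and `make_static_plc` are hypothetical, and the model is an assumption for illustration rather than a description of any actual PLC circuit.

```python
from typing import Callable, Sequence


def plc(inputs: Sequence[int], select: int) -> int:
    """Select one of the PLC inputs for output (behavioral model only)."""
    return inputs[select]


def make_static_plc(fce_select: int) -> Callable[[Sequence[int]], int]:
    """Static control: the selection is fixed once, as if by FCE bits
    written when the device is configured."""
    return lambda inputs: plc(inputs, fce_select)


if __name__ == "__main__":
    static = make_static_plc(2)          # selection frozen at configuration
    print(static([0, 1, 1, 0]))          # always passes input 2

    # Dynamic control: the selection is itself a logic signal that may
    # change from one evaluation to the next during normal operation.
    for select_signal in (0, 1):
        print(plc([0, 1, 1, 0], select_signal))
```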

[0043] Continuing now with the discussion of FIG. 1,
each FSB 50 may be a 16 bit by 16 bit parallel multiplier
circuit (i.e., a circuit capable of arithmetically multiplying
together two parallel 16-bit inputs to produce a parallel
32-bit product output). (Although 16x16 parallel multipliers
are often referred to in the specific illustrative embodiments
discussed herein, it will be understood that
parallel multipliers of any other sizes can be used instead
if desired. In general, these multipliers can have
size(s) n x m, where n and m are any desired integers
that are either the same as or different from one another.)
Assuming that each FSB 50 is a 16 x 16 parallel
multiplier, each FSB needs two 16-bit input buses and
a 32-bit output bus. This could take up a substantial
amount of interconnection resources on PLD 10 if such
resources were to be dedicated to supporting FSBs 50.
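
The bus widths just mentioned follow from simple arithmetic: the product of an n-bit and an m-bit unsigned number always fits in n + m bits. The Python sketch below works this out for the general n x m case; the function name `multiplier_bus_widths` and the restriction to unsigned operands are assumptions made for illustration.

```python
def multiplier_bus_widths(n: int, m: int) -> dict[str, int]:
    """Signal counts for an n x m parallel multiplier (unsigned model)."""
    return {
        "input_a": n,
        "input_b": m,
        "inputs_total": n + m,
        "product": n + m,
    }


def max_product_bits(n: int, m: int) -> int:
    """Bits needed for the largest unsigned n-bit by m-bit product."""
    return ((2**n - 1) * (2**m - 1)).bit_length()


if __name__ == "__main__":
    print(multiplier_bus_widths(16, 16))
    print(max_product_bits(16, 16))
```

For the 16 x 16 case this gives 32 input signals and 32 output signals per FSB, which is the routing demand the following paragraphs set out to reduce.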
[0044] To avoid having to dedicate such a large
amount of interconnection resources to FSBs 50, each
FSB 50 is arranged to get its 32 inputs from one (or
more) of the regions of local interconnect 70 that are
already provided to supply inputs to adjacent logic regions
30. In the particular example shown in FIG. 1, the
local interconnect region 70 that is located one such region
70 away from the region immediately adjacent to
an FSB 50 in the same super-region 20 is chosen for
this purpose. This local interconnect region (identified
as 70* for ease of reference) is chosen because it is not
used by memory region 40 but is relatively close to the
FSB 50. Conductors 100 (32 in number) are provided
from local interconnect region 70* to the associated FSB
50. When an FSB 50 is being used, these conductors
100 supply the two numbers to be multiplied (up to 16
bits each) to the FSB. Thus when an FSB 50 is being
used, it effectively "steals" some of the local routing provided
for logic region input.

[0045] The output signals of each FSB 50 (i.e., up to 
32 bits of multiplier product signals) are conveyed on 
dedicated conductors 110 to selected PLCs 60 in the
same super-region 20 that includes that FSB. For ease 
of reference these PLCs are identified by reference 
numbers 60*1, 60*2, 60*3, etc. (generically 60*). The
output signals of each FSB 50 are also conveyed on 
dedicated conductors 120 to those same PLCs 60* in 
the vertically downwardly adjacent super-region 20. 
Thus the outputs of each FSB 50 are at least initially 
conveyed on dedicated but relatively local conductors 
110 and 120. The FSB 50 outputs therefore do not (at 
least initially) consume any of the more general-purpose
routing and do not require dedicated drivers. This saves 
on power and device size. In addition, this (at least ini- 
tial) dedicated output routing for FSBs 50 reduces the 
potential for congestion in the general purpose routing, 
and (as will be shown in detail below) it facilitates adding 
together or otherwise combining multiplier outputs. 
[0046] The PLCs 60* selected to receive FSB 50 out- 
puts in each super-region 20 preferably have the follow- 
ing characteristics: (1) they serve logic elements 130 
that are adjacent to one another in an arithmetic carry 
chain that extends from logic element to logic element, 
and (2) they allow the outputs of two FSBs 50 to be applied
to the same logic elements 130 in pairs of bits (one
bit of the same order of magnitude from each of the two 
FSBs), with each bit pair being applied to one logic ele- 
ment and with the orders of magnitude of the pairs being 
in the same progression as the progression of orders of 
magnitude in the arithmetic carry chain serving those 
logic elements. In this way, the outputs of two vertically 
adjacent FSBs 50 can be added together by the logic 
elements 130 receiving those FSB outputs. The circuitry 
shown in FIG. 1 is therefore capable of operating as a 
multiplier-adder that makes relatively little use of the 
general purpose interconnection resources of PLD 10, 
at least for its internal operations. In addition to the fore- 
going, the registers of the logic elements 130 that per- 
form the above-described addition can be used to store 
the result of that addition, if desired. The output drivers 
of the logic elements 130 that perform the above-de- 
scribed addition can be used to drive the result of that 
addition out into the general purpose routing (intercon- 
nection resources) of device 10, if desired. 
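
The multiplier-adder arrangement described above can be sketched behaviorally as follows. In this Python model each loop iteration stands for one logic element 130 that receives one bit pair (one bit of the same significance from each of two FSBs) and passes a carry to the next element. The names `to_bits`, `carry_chain_add`, and `multiplier_adder`, the unsigned arithmetic, and the treatment of the final carry as a 33rd result bit are all assumptions made for illustration.

```python
WIDTH = 32  # product width of a 16 x 16 multiplier


def to_bits(value: int, width: int = WIDTH) -> list[int]:
    """Split a value into bits, least significant bit first."""
    return [(value >> i) & 1 for i in range(width)]


def carry_chain_add(bits_a: list[int], bits_b: list[int]) -> tuple[list[int], int]:
    """Add two bit vectors one bit pair at a time along a carry chain."""
    carry = 0
    sum_bits = []
    for a, b in zip(bits_a, bits_b):      # one iteration per logic element
        sum_bits.append(a ^ b ^ carry)
        carry = (a & b) | (carry & (a ^ b))
    return sum_bits, carry


def multiplier_adder(x0: int, y0: int, x1: int, y1: int) -> int:
    """Return x0*y0 + x1*y1, with the addition done on the carry chain."""
    p0 = to_bits(x0 * y0)                 # outputs of the first FSB
    p1 = to_bits(x1 * y1)                 # outputs of the second FSB
    sum_bits, carry_out = carry_chain_add(p0, p1)
    low = sum(bit << i for i, bit in enumerate(sum_bits))
    return low + (carry_out << WIDTH)


if __name__ == "__main__":
    print(multiplier_adder(3, 4, 5, 6))
```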
[0047] If the outputs of two FSBs 50 are not combined 
as described immediately above, then the circuitry
shown in FIG. 1 permits other possible uses. For example,
the outputs of an FSB 50 can be applied to logic
elements 130 in the same super-region 20 via the associated
conductors 110 and PLCs 60* (or in the downwardly
adjacent super-region 20 via the associated conductors
120 and the PLCs 60* in that downwardly adjacent
super-region). The output drivers of the receiving
logic elements 130 can be used to drive the FSB 50 out- 
put signals out into the general purpose interconnect of 
PLD 10, with or without intervening registration of those 
signals by the registers of those logic elements. The cir- 
cuitry of and associated with the receiving logic ele- 
ments 130 can be used to provide multiplier-accumula- 
tor ("MAC") operation, wherein each successive FSB 50 
output is arithmetically added (including subtraction as 
a possible alternative) to the contents of the registers of 
the receiving logic elements. Local feedback from the 
logic element register outputs to logic element inputs, 
and the arithmetic capabilities of the logic elements (in- 
cluding the above-mentioned carry chain features) are 
used in this accumulation function. The output drivers 
of the receiving logic element 130 are usable to drive 
the MAC output signals out into the general purpose in- 
terconnect. Again, very little general purpose intercon- 
nect is required to provide the above-described MAC 
functions, especially to support the operations that are 
internal to such a MAC function. 
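
The MAC operation described above can be summarized in a short behavioral model. In the Python sketch below, the attribute `register` stands for the registers of the receiving logic elements, and each call to `clock` adds (or subtracts) one FSB output to (or from) the stored value by way of local feedback. The class name, the 32-bit accumulator width, and the wrap-around on overflow are assumptions for illustration; the disclosure does not fix these details here.

```python
class MultiplierAccumulator:
    """Behavioral model of multiplier-accumulator (MAC) operation."""

    def __init__(self, width: int = 32) -> None:
        self.width = width                 # assumed accumulator width
        self.mask = (1 << width) - 1
        self.register = 0                  # contents of the logic element registers

    def clock(self, x: int, y: int, subtract: bool = False) -> int:
        """Accumulate one product into the register and return the new value."""
        product = (x * y) & self.mask      # output of the multiplier FSB
        if subtract:
            self.register = (self.register - product) & self.mask
        else:
            self.register = (self.register + product) & self.mask
        return self.register


if __name__ == "__main__":
    mac = MultiplierAccumulator()
    for x, y in [(2, 3), (4, 5), (6, 7)]:
        print(mac.clock(x, y))
```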
[0048] From the foregoing it will be appreciated that 
the organization of the circuitry shown in FIG. 1 avoids 
the need to dedicate either output registers or output 
drivers to FSB 50. In applications in which the outputs
of an FSB 50 require registration and/or driving out into 
the general purpose interconnect of PLD 10, those ca- 
pabilities can be provided by the registers and/or output 
drivers of the logic elements to which the FSB outputs 
are locally conveyed. 

[0049] It will be appreciated that sharing of local inter- 
connect resources 70* between adjacent logic regions 
30 and FSB 50 may at least somewhat sacrifice the us- 
ability of those logic regions when PLD 10 is pro- 
grammed to use FSB 50. It is believed, however, that 
this is more than offset by the avoidance of having to 
provide input routing that is dedicated to FSB 50. FSB 
50 could alternatively share input routing with the asso- 
ciated memory region 40, but it is believed preferable to 
share with logic regions 30 (e.g., because there are gen- 
erally more logic regions than memory regions). As an- 
other possible alternative, each FSB 50 could get its in- 
puts from more than one nearby region 70 of local inter- 
connect. This might make it possible for use of an FSB 
50 to less significantly impact the usability of the logic 
regions 30 also served by those regions 70 because 
less of the resources of each such region 70 would have 
to be turned over to the FSB. On the other hand, such 
an approach might mean that even more of the logic of 
PLD 10 would be impacted by the use of an FSB, and 
that might be less desirable than a greater impact on a 
smaller amount of the logic. 

[0050] On the output side, use of the output drivers of
the logic elements 130 that receive the outputs of an
FSB 50 to drive the FSB outputs (or signals based on
those outputs) out into the general purpose interconnect
of PLD 10 saves having to provide separate output drivers
that are dedicated to the FSB outputs. This is a significant
saving in an expensive and power-consuming
resource (i.e., drivers). On the other hand, it may mean 
that the usability of other resources of the logic elements 
that receive the FSB 50 outputs is at least partly sacri- 
ficed when the FSB is being used. But another benefit 
of this approach is the ability to use those other logic 
element resources to effectively extend FSB 50 func- 
tionality to MAC or multiplier-adder operation or to any 
other similar task or tasks, if desired. 
[0051] One of the points made above should perhaps
be further amplified. Because each FSB 50 can produce
as many as 32 parallel outputs, 32 adjacent logic elements
130 in the super-regions 20 that can make use of those
signals are enabled to receive those signals in order of
magnitude. In the example shown in FIG. 1, these
32 logic elements in a typical super-region 20 are the
ten in the left-most region 30, the ten in the second-from-left-most
region 30, the ten in the third-from-left-most
region 30, and the top two in the fourth-from-left-most
region 30. (Constructing PLD 10 with ten logic elements
130 per region 30 is only illustrative, and constructing
regions 30 with any other number of logic elements 130
(e.g., more than ten or less than ten (even as few as one
or two)) is also possible. It will be readily apparent how
distribution of the outputs of FSBs 50 can be changed
to accommodate different numbers of logic elements
130 in regions 30.) It is assumed in this discussion that
the carry chain is constructed as shown by the carry
leads 131 in FIG. 1A, and that it therefore starts with the
upper-left-most logic element 130, goes from logic element
to logic element down the left-most region 30, then
goes up to the top of the second-from-left-most region
30, down that region, and so on. Thus the ten least sig- 
nificant output bits of an FSB 50 are applied to the left- 
most groups of PLCs 60*1 in the appropriate super-re- 
gions 20; the ten next more significant outputs of the 
FSB are applied to the second-from-left-most groups of 
PLCs 60*2 in the appropriate super-regions 20; the ten 
next more significant outputs of the FSB are applied to 
the third-from-left-most groups of PLCs 60*3 in the ap- 
propriate super-regions 20; and the two most significant 
outputs of the FSB are applied to the fourth-from-left- 
most groups of PLCs 60*4 in the appropriate super-re- 
gions 20. 

[0052] From PLCs 60*1, the ten least significant FSB
50 output bits can be programmably routed (via the associated
local interconnect 70) into the ten logic elements
130 in the left-most regions 30 in order of the significance
of those bits (i.e., least significant bit going to
the top-most logic element 130 (which is at the least significant
position in the carry chain); next-more-significant
bit going to the next-to-top-most logic element 130
(which is at the next more significant position in the carry
chain); and so on). From PLCs 60*2, the ten next more
significant FSB 50 output bits can be programmably
routed (via the associated local interconnect 70) to the
ten logic elements 130 in the second-from-left-most regions
30, again in order of the significance of those bits
so as to continue to match the progression of significance
in the carry chain. The same applies for the outputs
of PLCs 60*3 and 60*4. In this way the FSB 50 outputs
are preferably routed into the logic in a way that
facilitates use of the receiving logic to arithmetically further
process the FSB outputs.
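
The distribution of FSB output bits described in the two preceding paragraphs amounts to a simple index mapping. The Python sketch below expresses it with zero-based indices (region 0 is the left-most region 30, element 0 is the top-most logic element 130, bit 0 is the least significant bit); the function names and the zero-based numbering are assumptions for illustration.

```python
def plc_group_for_bit(bit_index: int, elements_per_region: int = 10) -> tuple[int, int]:
    """Map an FSB output bit to (region, element) along the carry chain."""
    region = bit_index // elements_per_region   # 0 = left-most region
    element = bit_index % elements_per_region   # 0 = top-most logic element
    return region, element


def bits_per_region(product_width: int = 32, elements_per_region: int = 10) -> list[int]:
    """Count how many output bits land in each region, left to right."""
    counts: dict[int, int] = {}
    for bit in range(product_width):
        region, _ = plc_group_for_bit(bit, elements_per_region)
        counts[region] = counts.get(region, 0) + 1
    return [counts[r] for r in sorted(counts)]


if __name__ == "__main__":
    print(bits_per_region())          # 32 bits over regions of ten elements
    print(bits_per_region(32, 8))     # a different, hypothetical region size
```

Changing `elements_per_region` shows how the distribution would shift for regions with a different number of logic elements.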

[0053] It should also be pointed out that deriving the 
inputs for an FSB 50 from a region of local interconnect 
70* that is already provided for use in routing signals to 
logic regions 30 gives the FSB the benefit of flexible input
routing because such flexible routing is typically an
attribute of local interconnect 70. Also, as has already 
been at least suggested, deriving inputs for an FSB 50 
from local interconnect 70* that is associated with logic 
regions 30 rather than with a memory region 40 allows 
independent operation of the FSB and the memory region
in each super-region 20.

[0054] FIGS. 2-6 show several examples of ways in
which circuitry of the type shown in FIG. 1 can be used.
In FIG. 2 the FSB 50 in each super-region 20 is initially
used by the other circuitry in that super-region. For example,
the output drivers 138 in or associated with the
logic elements 130 that receive FSB 50 outputs (in part
via associated conductors 110) are used to drive the
FSB output signals out via leads 139 into the general
purpose interconnect of the PLD. Of course, the programmable
logic of the logic elements 130 that receive
the FSB 50 outputs may also be used to process the
FSB output signals prior to driving the resulting signals
out via elements 138 and 139, if desired.

[0055] FIG. 3 is similar to FIG. 2, except that it shows
that the FSB 50 output signals can be registered by the
registers 134 of the receiving logic elements 130 prior
to driving the registered signals out via elements 138
and 139. Again, the logic of the receiving logic elements
can also be used to process the FSB output signals prior
to registration of the resulting signals by registers 134,
if desired.

[0056] FIG. 4 is again generally similar to FIGS. 2 and 
3, except that it shows use of the resources of the re- 

45 ceiving logic elements 130 to perform an accumulation 
operation on the output signals of FSB 50 (e.g., to pro- 
vide a multiplier-accumulator ("MAC") capability). The 
output signals of an FSB 50 are combined (i.e., added) 
in the programmable logic 132 of the receiving logic elements
130 with the outputs of the registers 134 of those
logic elements to produce new values for storage by the 
registers. The register output signals are also available 
for driving out via elements 138 and 139. 
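The accumulation described for FIG. 4 can be sketched behaviorally as follows. This is only a minimal software model, not a description of the actual circuitry: the function name `mac_step` is hypothetical, and the 40-bit accumulator width is an assumption borrowed from the FIG. 10 example discussed later.

```python
ACC_BITS = 40                      # assumed accumulator width (from the FIG. 10 example)
MASK = (1 << ACC_BITS) - 1

def mac_step(acc, a, b):
    """Model one clock of multiply-accumulate.

    The product plays the role of the FSB 50 output; the addition plays the
    role of programmable logic 132; the returned value is what registers 134
    would store for the next clock.
    """
    product = a * b
    return (acc + product) & MASK  # wrap to the accumulator width

acc = 0
for a, b in [(3, 4), (5, 6), (7, 8)]:
    acc = mac_step(acc, a, b)
print(acc)  # 3*4 + 5*6 + 7*8 = 98
```

The registered value is both fed back to the adder and available as the output, which mirrors the statement that the register outputs can be driven out via elements 138 and 139.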
[0057] FIG. 5 shows use of pairs of super-regions 20
to provide multiplier-adder capability. Considering, for
example, the upper pair of super-regions 20, the output
signals of the FSB 50 in the upper super-region are applied
(in part via associated leads 120) to receiving logic
elements 130 in the lower super-region. The output signals
of the FSB 50 in the lower super-region in the upper
pair are also applied (in part via associated leads 110)
to those same receiving logic elements 130. The programmable
logic 132 of the receiving logic elements is
used to add these two FSB output signals, and the results
may be output via elements 138 and 139, either
with or without registration by registers 134. PLCs 136
in or associated with the receiving logic elements 130
are controllable to select whether registered or unregistered
signals are output. The lower pair of super-regions
20 in FIG. 5 operates similarly to the upper pair.
[0058] FIG. 6 is generally similar to FIG. 5 but shows
the use of yet another super-region 20 (the one on the
right in FIG. 6) to add the outputs of the two multiplier-adders
on the left. General purpose interconnect
140/150 is used to route the outputs of the two multiplier-adders
on the left to receiving logic elements 130
in the super-region 20 on the right. The programmable
logic 132 in those receiving logic elements is used to
add together the signals from the two multiplier-adders.
The resulting signals can be output via elements 138
and 139 in the super-region 20 on the right, either with
or without registration by the registers 134 in the logic
elements performing the addition. PLCs 136 in the super-region
20 on the right determine whether the outputs
are registered or unregistered.

[0059] Although FIG. 6 shows the final addition being
performed in a fifth super-region 20 (on the right in FIG.
6), it will be understood that it could alternatively be performed
in whole or in part in one or more of the super-regions
20 on the left in FIG. 6. For example, any one
or more of these super-regions on the left may have sufficient
resources left over from the first level of multiplication
and addition to also perform the second level of
addition.

[0060] It will be appreciated that FIGS. 2-6 are greatly 
simplified in that they tend to show only single circuit 
paths and single circuits that are merely representative 
of what are typically multiple (e.g., up to 32 in the FIG.
1 example) parallel circuit paths and circuits. FIGS. 2-6
are also only examples of the many ways that circuitry 
of the type shown in FIG. 1 can be configured (i.e., pro- 
grammed) for use. 

[0061] FIG. 7 shows an example of a frequently need- 
ed circuit function that is readily implemented in PLDs 
constructed as described above. FIG. 7 shows what is 
often referred to as a finite impulse response ("FIR") fil- 
ter in a configuration sometimes called "direct form 2." 
FIR filters are very often needed in digital signal
processing ("DSP"). Successive samples of data to be
processed are shifted in parallel through parallel groups
of flip-flops 210a-210d. Each successive parallel output
of each flip-flop group 210 is multiplied by a respective
one of parallel coefficients C0-C3 in a respective one of
parallel multipliers 220a-220d. The concurrent parallel
outputs of multipliers 220a-220d are added together in
parallel adder 230 to produce a final output signal. It will
be apparent that everything in FIG. 7 below shift registers
210 can be readily implemented in the circuitry
shown in FIG. 6, which (as has been said) is one pos- 
sible configuration and therefore use of circuitry of the 
type shown in FIG. 1. In circuitry of the type shown in 
FIG. 6 the addition represented by adder 230 in FIG. 7 
is performed in three parts in the three adder logic cir- 
cuitries 132. 

[0062] Another example of circuitry frequently needed 
in digital signal processing ("DSP") is shown in FIG. 8. 
This is an example of an infinite impulse response ("IIR")
filter. The upper part of this circuitry is similar to what is 
shown in FIG. 7. Additional inputs to adder 230 in FIG. 
8 are adder 230 outputs delayed by flip-flops 240a and 
240b and multiplied by additional coefficients C4 and C5 
in parallel multipliers 250a and 250b. As in the case of 
FIG. 7, the upper multiplier and adder part of FIG. 8 can 
be implemented as shown in FIG. 6. The lower multiplier 
and adder part can be implemented in another pair of 
super-regions 20 like either pair on the left in FIG. 6. The 
final output can be produced in another super-region 20 
like the one on the right in FIG. 6, which receives and 
adds the outputs of the above-mentioned additional pair 
of super-regions and the outputs of the super-region on 
the right in FIG. 6. 
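The IIR structure of FIG. 8 can be sketched in the same behavioral style. This is a hedged model only: it assumes that flip-flops 240a and 240b are connected in series, so that C4 multiplies the output delayed by one sample and C5 the output delayed by two samples. The function name `iir_run` and the use of integer coefficients are also assumptions made for illustration.

```python
def iir_run(samples, c):
    """Run samples through a filter with feed-forward taps C0..C3 and
    feedback taps C4, C5.  c = [C0, C1, C2, C3, C4, C5]."""
    x = [0, 0, 0, 0]      # feed-forward delay line (flip-flop groups 210)
    y1 = y2 = 0           # feedback delays (flip-flops 240a, 240b), assumed in series
    out = []
    for s in samples:
        x = [s] + x[:-1]
        forward = sum(xi * ci for xi, ci in zip(x, c[:4]))  # upper part, as in FIG. 6
        feedback = c[4] * y1 + c[5] * y2                    # additional multiplier-adder pair
        y = forward + feedback                              # final adding super-region
        y2, y1 = y1, y
        out.append(y)
    return out

print(iir_run([1, 0, 0, 0, 0, 0], [1, 0, 0, 0, 1, 0]))  # [1, 1, 1, 1, 1, 1]
```

With C0 = 1 and C4 = 1 the filter acts as a running sum, so a single impulse is held indefinitely, which illustrates the "infinite" impulse response.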

[0063] The ease with which circuitry of the type shown 
in FIG. 1 can be configured to implement functions of 
the type shown in FIGS. 7 and 8 demonstrates the use- 
fulness and therefore importance of FIG. 1 type circuitry. 
[0064] FIG. 9 shows that relatively little needs to be 
added to interconnection resources 60/70 to add multi- 
plier outputs 110/120 to the signals available as inputs 
to logic elements 130. Each PLC 60* that will receive a 
multiplier output 110 or 120 already typically receives 
multiple ("N") inputs from the general purpose intercon- 
nect such as the associated global interconnection re- 
sources 140. (In FIG. 9 what was previously represent- 
ed by a single element identified by reference number 
60*1 is shown more completely as multiple, separate 
PLCs 60*1a, 60*1r, 60*1b, 60*1s, etc.) 
[0065] In the depicted illustrative architecture only 
one more input from a multiplier 50 needs to be added 
to each PLC 60*. For example, PLC 60*1a receives one
of multiplier outputs 120 in addition to its N inputs from
resources 140. Similarly, PLC 60*1r receives one of
multiplier outputs 110 in addition to its N inputs from re- 
sources 140. Within local interconnection resources 70, 
the output 72 of any one of PLCs 60*1 is programmably 
connectable to any one of the LE 130 inputs 80 served 
by those local interconnection resources by appropri- 
ately programming any appropriate one of programma- 
ble interconnections 74 (also sometimes included within 
generic references to PLCs). Assuming, for example, 
that the logic element 130 shown in FIG. 9 is the one 
that is generally designated to receive the least significant
multiplier output bits, the multiplier output lead 120
that is connected to PLC 60*1a may carry a least significant
multiplier output bit, and the multiplier output lead
110 that is connected to PLC 60*1r may carry another
least significant multiplier output bit. Either of these bits 
can be routed to the logic element 130 shown in FIG. 9 
by appropriately programming the FCEs associated 
with PLCs 60*1a or 60*1r and the PLCs 74 serving the
outputs of those (and other) PLCs 60. Depicted logic el- 
ement 130 can then deal with that signal in any of the 
various ways described above (e.g., output it with or 
without registration, or perform one binary place of MAC 
operation using it). Alternatively, both of these multiplier 
output bits can be routed to respective inputs 80 of de- 
picted logic element 130 via appropriately programmed 
PLCs 60*1a and 60*1r and associated elements 72 and
74. Depicted logic element 130 can process these sig- 
nals as described above (e.g., add them as part of a 
multiplier-adder operation). FIG. 9 shows again that the 
outputs of multipliers 50 are routed to logic elements 130
with little or no use of global interconnection resources
such as 140, 150, etc. Moreover, even the impact on the
routing feeding local interconnection resources 70 is rel- 
atively minor, with only one more input being added to 
each of selected PLCs 60. 

[0066] It will be appreciated (with continued reference 
to FIG. 9) that in the example in which each region 30 
includes ten logic elements 130, each group of PLCs 
60*1, 60*2, etc., that receives multiplier outputs 110/120
receives a maximum of ten outputs 110 and a maximum
of ten outputs 120. Thus, for example, PLC group 60*1
includes ten PLCs 60*1a, 1b, 1c, etc., for respectively
receiving ten multiplier outputs 120, and ten more PLCs
60*1r, 1s, 1t, etc., for respectively receiving ten multiplier
outputs 110. In this architecture, however, each PLC
group 60 (including PLC groups 60*) includes more than 
20 PLCs. PLCs 60 are therefore only partially populated. 
Of course, if (as has already been mentioned) each re- 
gion 30 has a number of logic elements 130 different 
than ten, then the number of PLCs 60 associated with 
each region 30 will tend to be increased or decreased 
in approximate proportion to the increase or decrease 
in the number of logic elements. The number of those 
PLCs 60* that receive signals 110 or 120 will also in- 
crease or decrease with the increase or decrease in 
number of logic elements in a region, but overall will re- 
main fewer than the total number of PLCs in each PLC 
group. PLCs 60 will therefore remain only partially pop- 
ulated. 

[0067] Figures such as FIGS. 2-5 and FIG. 9 show how
the use of logic elements 130 to receive the output signals
of multipliers 50 allows the resources of those logic
elements (e.g., the logic element programmable logic, 
registers, output drivers, and/or output routing to the 
general/global interconnect) to be used for the multiplier 
outputs, thereby avoiding the need to provide additional 
such resources that are dedicated to serving the multi- 
plier outputs. Moreover, routing the multiplier outputs to 
and through logic elements allows the multiplier outputs 
to be used in any of the several modes illustrated by
FIGS. 2-6 (i.e., unregistered output, registered output,
multiplier-accumulator ("MAC") mode, multiplier-adder
mode with or without registration, etc.). The total amount
of general/global routing that is required to perform and
add together the results of two multiplications is greatly
reduced (e.g., by about 50%). This percentage reduction
holds for performing and adding together the results
of any number of multiplications.
[0068] FIGS. 10 and 11 illustrate how the circuitry of
this invention can address an issue that is encountered
in operations like MAC operation. FIG. 10 shows again
a basic MAC structure (like FIG. 4 but with some specific
examples of bus widths indicated). In particular, FIG. 10
shows that multiplier 50 may be constructed to multiply
two words of up to 16 bits each to produce a product
word of up to 32 bits. The adder and register portions of
the MAC circuitry must have significantly greater capacity
than 32 bits in order to ensure that overflow and/or
underflow (generically simply "overflow") do not occur
excessively frequently as a result of accumulating successive
product words. Thus FIG. 10 shows the adder
and register portions of the circuitry having capacity adequate
to handle words of up to 40 bits.
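The headroom provided by the wider accumulator can be quantified with a short worked calculation. The text above gives only the 32-bit and 40-bit widths; the count derived below is one common way of expressing the resulting margin, assuming signed two's complement words, and is not a figure stated in the specification.

```python
PRODUCT_BITS = 32   # width of the multiplier output word
ACC_BITS = 40       # width of the adder and register portions

guard_bits = ACC_BITS - PRODUCT_BITS

# A signed 32-bit word lies in [-2**31, 2**31 - 1]; a signed 40-bit
# accumulator holds [-2**39, 2**39 - 1].  This many worst-case products
# can always be accumulated without overflow:
safe_terms = 2 ** (ACC_BITS - 1) // 2 ** (PRODUCT_BITS - 1)

print(guard_bits, safe_terms)  # 8 256
```

In other words, each additional accumulator bit doubles the number of worst-case products that can be summed safely, so eight guard bits cover 256 such products.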
[0069] Because two's complement arithmetic is frequently
used in the intended applications of the circuitry
of this invention, it is desirable to be able to extend the
sign of the multiplier 50 output to the additional more-significant
arithmetic places used by adder logic 132. (In
two's complement arithmetic a positive number is
changed to a negative number of equal absolute value
by inverting all the bits of the positive number and then
adding 1 (see FIG. 12). Thus the most significant bit of
all positive numbers is 0, and the most significant bit of
all negative numbers is 1. If word length is increased, a
0 in the initially most significant place must be "extended"
to all additional places of even greater significance,
and a 1 in the initially most significant place must be
similarly "extended" to all additional places of even
greater significance. Operations of this kind are sometimes
referred to as "sign extension.")
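Sign extension as just described can be illustrated with a small sketch. The function name `sign_extend` and its default widths (32 bits extended to 40 bits, matching the FIG. 10 example) are assumptions for illustration; the code models the arithmetic only, not the routing used to achieve it.

```python
def sign_extend(value, from_bits=32, to_bits=40):
    """Copy the most significant bit of a from_bits-wide word into every
    additional higher-order place of a to_bits-wide word."""
    value &= (1 << from_bits) - 1
    sign = (value >> (from_bits - 1)) & 1
    if sign:
        # Fill the extra high-order places with 1s.
        value |= ((1 << (to_bits - from_bits)) - 1) << from_bits
    return value

print(hex(sign_extend(0x00000005)))  # 0x5
print(hex(sign_extend(0xFFFFFFFB)))  # 0xfffffffffb  (-5 as a 40-bit word)
```

A positive word is unchanged, while a negative word gains eight leading 1s, so both represent the same number at the wider word length.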
[0070] In the example shown in FIG. 10 it is desirable
to be able to automatically "sign extend" the 32-bit multiplier
output prior to its application to 40-bit adder logic
132. FIG. 11 shows how this can be done in the multiplier
output circuitry of this invention in accordance with a further
feature of the invention. With reference to FIG. 1,
FIG. 11 shows additional detail for the fourth region 30
from the left in a representative super-region 20. In accordance
with earlier discussions, the two most significant
outputs 120 of a multiplier 50 in another super-region
20 are respectively applied to two PLCs 60*4a and
60*4b. Similarly, the two most significant outputs 110 of
the multiplier 50 in the same super-region 20 are respectively
applied to two PLCs 60*4r and 60*4s. In addition,
the most significant output 110 of that multiplier is applied
to eight other PLCs 60*4z to facilitate extension of
the sign on that most significant lead 110 to eight even
more significant places of adder logic 132 (performed
by eight additional logic elements 130 in the region 30
served by the local interconnect outputs and other resources
70/72 of the depicted PLCs 60). This facilitates
extension of the formerly most significant multiplier output
bit on the most significant lead 110 to eight additional
places of binary addition performed in the MAC. (As a
possible alternative to what is shown in FIG. 11, local
interconnect 70 could be programmed to apply the output
of PLC 60*4s to eight additional logic elements 130.
However, the more physical solution shown in FIG. 11
may be preferred.)

[0071] Although FIG. 11 shows additional PLC 60*4z 
inputs only for the most significant lead 110, it will be 
understood that the same thing could alternatively or additionally
be done for the most significant lead 120.
[0072] Another possible aspect of the invention re- 
lates to facilitating the provision of MAC operation hav- 
ing the ability to either add or subtract new multiplier in- 
puts. The objective is to provide MAC circuitry that can 
perform either of the following operations: 

MAC = MAC + INPUT (1) 

or 

MAC = MAC - INPUT (2) 

The conventional approach to handling the input sub- 
traction alternative is to two's complement the input and 
add the result to the previously accumulated value. But 
two's complementing requires adding a 1 to the inverted 
input, and this takes time and may be difficult to provide 
for at the appropriate point in programmable logic, which 
may be working with words of any of many possible 
lengths and locations relative to the physically fixed circuit
components. To amplify this last point, when a multiplication
smaller than the maximum for which the PLD
is designed is performed (e.g., a 12 x 12 multiplication in a
PLD designed for maximum 16 x 16 multiplication), the multiplier
inputs generally occupy the most significant bits of
the multiplier input bus. The result therefore occupies 
the most significant bits of the multiplier result bus. As 
a consequence, the least significant bit of the result will 
not start at the least significant bit of the adder. Its start 
location will vary, depending on the precision of the num- 
bers to be multiplied. 
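The way the starting position of the result varies with operand precision can be sketched as follows. The model assumes unsigned operands that are left-justified on a 16-bit input bus (i.e., padded with zeros in the low-order places); the function names are hypothetical and the sketch is illustrative only.

```python
MAX_BITS = 16  # maximum operand width of the multiplier

def result_lsb_position(operand_bits):
    """Bit position, within the product bus, of the least significant bit of
    the true result when both operands are left-justified."""
    pad = MAX_BITS - operand_bits
    return 2 * pad           # each operand contributes `pad` low-order zeros

def aligned_product(a, b, operand_bits):
    """Product as it appears on the multiplier output bus."""
    pad = MAX_BITS - operand_bits
    return (a << pad) * (b << pad)

print(result_lsb_position(12))          # 8
print(aligned_product(3, 5, 12) >> 8)   # 15
```

Under these assumptions a 12 x 12 result begins at bit 8 rather than bit 0, so the 1 that two's complementing would require has to be added at a position that depends on the precision in use.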

[0073] To avoid these problems, one's complement 
(rather than two's complement) arithmetic can be used 
in accordance with this invention. (One's complement 
representation is similar to two's complement represen- 
tation, except that a 1 is not added when going from a 
positive number to a negative number of equal absolute
value (see FIG. 13).) The logic that underlies producing
result (2) above using one's complement logic is as follows
(where "!" denotes one's complementing the item
to which it is appended as a prefix):

!MAC + INPUT (3)

= - MAC - 1 + INPUT (4)

!(- MAC - 1 + INPUT) (5)

= MAC + 1 - INPUT - 1 (6)

= MAC - INPUT (7)

Condensing lines (3)-(7) above, MAC - INPUT results
from the following:

!(!MAC + INPUT) (8)
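Expression (8) can be checked numerically with a short sketch. The 40-bit word width is an assumption taken from the FIG. 10 example, and the function names are hypothetical. Values are treated as fixed-width words, so results wrap modulo 2^40 exactly as a 40-bit adder would.

```python
ACC_BITS = 40
MASK = (1 << ACC_BITS) - 1

def ones_complement(x):
    """Invert every bit of an ACC_BITS-wide word (no 1 is added)."""
    return ~x & MASK

def subtract_via_inversion(mac, inp):
    """Compute (mac - inp) modulo 2**ACC_BITS as !(!MAC + INPUT)."""
    return ones_complement((ones_complement(mac) + inp) & MASK)

print(subtract_via_inversion(100, 30))                 # 70
print(subtract_via_inversion(5, 8) == (5 - 8) & MASK)  # True
```

Only bit inversions and one ordinary addition are used, so no separate "add 1" step is needed at any particular bit position.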

[0074] FIG. 14 shows how a representative logic module
130' can be enhanced in accordance with the invention
to facilitate performing the operation represented by
expression (8) above. The output of flip-flop 134 is ap- 
plied to PLC 342 in both true (uninverted) and comple- 
ment (inverted) form. The inverted form is produced by 
inverter 340. PLC 342 is controlled by the output of PLC 
320 to select either of its inputs. The inputs to PLC 320 
are the output of FCE 312 and a possibly dynamic signal
(e.g., from elsewhere in the logic or other circuitry of de- 
vice 10 (FIG. 1)). PLC 320 is controlled by FCE 310 to 
select either of its inputs as its output. Thus the control 
of PLC 342 can be either static (based on the pro- 
grammed state of FCE 312) or dynamic (based on the 
state of the "from logic" signal). 
[0075] The output signal of PLC 342 is fed back as 
one input to adder logic 132. The other input to the adder
logic may come from a multiplier 50 as described earlier
in this specification. The output of adder logic 132 is applied
to PLC 332 in both true and complement form. The
complement form is produced by inverter 330. PLC 332 
is controlled in the same way and by the same control 
signal as PLC 342. Thus again, the control of PLC 332 
can be either static or dynamic. Based on the control 
input it receives, PLC 332 selects one of its two other 
input signals for application to flip-flop 134 as the new 
accumulated value. 

[0076] From the foregoing, it will be seen that logic 
element 130' can operate as an accumulator that either
adds or subtracts its input signal. Moreover, logic element
130' can be programmed to either always add its
input, always subtract its input, or add or subtract at various
times depending on the current state of the "from
logic" signal in FIG. 14.
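The add/subtract accumulator of FIG. 14 can be modeled behaviorally as shown below. This is a sketch under stated assumptions: the class and parameter names are hypothetical, the 40-bit width is taken from the FIG. 10 example, and the polarity of the control signals (a true value selecting the inverted paths, and FCE 310 selecting the dynamic signal when set) is assumed rather than taken from the specification.

```python
class AddSubAccumulator:
    """Behavioral model of logic element 130' used as an accumulator."""

    def __init__(self, bits=40, fce_310_dynamic=False, fce_312_subtract=False):
        self.mask = (1 << bits) - 1
        self.dynamic = fce_310_dynamic           # assumed meaning of FCE 310
        self.static_subtract = fce_312_subtract  # assumed meaning of FCE 312
        self.reg = 0                             # flip-flop 134

    def clock(self, inp, from_logic=False):
        # PLC 320: choose the static or the dynamic control value.
        subtract = from_logic if self.dynamic else self.static_subtract
        # Inverter 340 and PLC 342: true or complemented feedback.
        feedback = ~self.reg & self.mask if subtract else self.reg
        # Adder logic 132.
        total = (feedback + inp) & self.mask
        # Inverter 330 and PLC 332: true or complemented sum into the register.
        self.reg = ~total & self.mask if subtract else total
        return self.reg

acc = AddSubAccumulator(fce_310_dynamic=True)
print(acc.clock(10), acc.clock(3, from_logic=True), acc.clock(5))  # 10 7 12
```

Because the same control value drives both selections, each clock performs either a plain addition or the double inversion of expression (8).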

[0077] As has already been mentioned, various features
of this invention are applicable to types of FSBs
other than multipliers. For example, FIG. 15 shows an
illustrative super-region 20' that can be included in a
PLD like PLD 10 (FIG. 1) and that includes dedicated
barrel shifter circuitry 400 in place of the dedicated parallel
multiplier circuitry 50 shown and described earlier.
Barrel shifter circuitry 400 (which can be per se conventional)
may be capable of any or all of the functions typically
associated with barrel shifters. These functions
generally include any of several types of shifts of the bits
of a word applied to the barrel shifter in parallel. In addition,
the amount (number of places) by which the bits
are shifted may be selectable. For example, barrel shifter
400 may be capable of such shifts as "arithmetic shift
right (or left)", "logical shift right (or left)", "rotate right (or
left)", etc. It may also be desirable to register or not register
the outputs of barrel shifter 400. Barrel shifter 400
has a variety of possible uses such as in digital signal 
processing ("DSP"), arithmetic computation, shifting 
bits as part of logic operations, etc. 
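The shift types named above can be illustrated with a small behavioral model. The function name `barrel_shift`, its string-valued `kind` argument, and the 32-bit word width are assumptions chosen for illustration; the sketch does not describe how circuitry 400 is actually built or encoded.

```python
WIDTH = 32
MASK = (1 << WIDTH) - 1

def barrel_shift(word, amount, kind, left=False):
    """Shift a WIDTH-bit word by `amount` places.

    kind is 'logical', 'arithmetic', or 'rotate'; `left` selects direction.
    """
    if kind not in ("logical", "arithmetic", "rotate"):
        raise ValueError("unknown shift kind: %r" % kind)
    if not 0 <= amount < WIDTH:
        raise ValueError("amount must be in 0..WIDTH-1")
    word &= MASK
    if kind == "rotate":
        if left:
            return ((word << amount) | (word >> (WIDTH - amount))) & MASK
        return ((word >> amount) | (word << (WIDTH - amount))) & MASK
    if left:
        # Logical and arithmetic left shifts give the same bit pattern.
        return (word << amount) & MASK
    if kind == "logical":
        return word >> amount                  # zeros enter from the left
    sign = word >> (WIDTH - 1)
    fill = (MASK << (WIDTH - amount)) & MASK if sign else 0
    return (word >> amount) | fill             # sign bit enters from the left

print(hex(barrel_shift(0x80000000, 4, "logical")))            # 0x8000000
print(hex(barrel_shift(0x80000000, 4, "arithmetic")))         # 0xf8000000
print(hex(barrel_shift(0x80000001, 1, "rotate", left=True)))  # 0x3
```

The three right shifts differ only in what enters the vacated high-order places: zeros, copies of the sign bit, or the bits shifted out at the other end.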
[0078] In the illustrative embodiment shown in FIG.
15, barrel shifter 400 may have approximately 40 parallel
inputs 100 (e.g., 32 bits of the word to be manipulated
by the barrel shifter, two bits indicating the type of shift
operation to be performed, and six bits indicating the
number of places the data is to be shifted). As in the
case of a multiplier 50 in FIG. 1, the inputs 100 to barrel
shifter 400 preferably come (at least for the most part) 
from the local interconnect 70 that otherwise supplies at 
least some of the main inputs 80 to two regions 30 of 
logic. Preferably the local interconnect 70 used for this 
purpose is relatively close to barrel shifter 400, but is 
not local interconnect that supplies main inputs 90 to 
memory region 40. Thus (as in the case of multiplier 50) 
barrel shifter 400 preferably does not require its own 
dedicated input routing. Only relatively local routing re- 
sources, that are already provided for other purposes, 
are taken up when barrel shifter 400 is to be used. The 
input routing for barrel shifter 400 requires no significant 
addition to either the local or global routing resources of 
the PLD, and this input routing does not even impose 
any significant additional burden on the global routing 
resources that are provided. By sharing local routing 70
with selected logic regions 30, use of barrel shifter 400 
does not interfere with simultaneous use of the memory 
region 40 in the same super-region 20' in the presently 
preferred embodiment. However, in an alternative em- 
bodiment a barrel shifter 400 could share local routing 
70 with an associated memory region 40. 
[0079] The output signals of barrel shifter 400 are 
preferably handled in very much the same way that the 
output signals of multipliers 50 in FIG. 1 are handled. In 
particular, dedicated, relatively local routing 410 is pro- 
vided for applying the output signals of barrel shifter 400 
to the local routing resources 60/70 of selected logic re- 
gions 30 in the super-region 20' that includes the barrel 
shifter. This routing allows the (e.g., 32) parallel output
signals of barrel shifter 400 to be applied in parallel to a
corresponding number of logic elements 130. These
logic elements can handle the barrel shifter output signals
in any of several ways. For example, the output
drivers of or associated with the receiving logic elements
130 can be used to drive the barrel shifter output signals
out into the more general and global routing resources
(e.g., 140/150) of the device, and this can be done either
with or without registration of those signals by the registers
(flip-flops) of or associated with the receiving logic
elements. As another example, the programmable logic
of the receiving logic elements 130 can be used to begin
to further process the barrel shifter output signals, and
then the resulting signals can be driven out via the logic 
element output drivers (either with or without registration 
by the logic element registers). 

[0080] Again, this arrangement for dealing with the
output signals of barrel shifter 400 has a number of advantages.
For example, it avoids having to provide additional
output drivers and registers that are dedicated
for use by the barrel shifter. The initial use of dedicated
local output routing 410 reduces the impact of barrel
shifter operation on the more general and possibly global
interconnection resources of the device. Feeding the
barrel shifter output signals relatively directly into logic
elements 130 allows any desired further processing of
those signals to begin more immediately in those logic
elements.

[0081] FIG. 16 shows an illustrative PLD 10' in ac- 
cordance with the invention that includes both super-re- 
gions 20 of the type shown in more detail in FIG. 1 and 
super-regions 20' of the type shown in more detail in
FIG. 15. In illustrative PLD 10' each column of super-regions
includes several super-regions 20 having dedicated
parallel multiplier circuits 50, and one super-region
20' (at the bottom of the column) having dedicated
parallel barrel shifter circuitry 400. This reflects an anticipated
need for more multipliers than barrel shifters,
but any ratio of multipliers to barrel shifters can be im- 
plemented. 

[0082] FIG. 17 shows how a representative portion of
the illustrative circuitry shown and described earlier can
be generalized to any of a wide range of function-specific
blocks ("FSBs") 500. FSB 500 in FIG. 17 is located
and connected in the circuitry of super-region 20" where
a multiplier 50 or a barrel shifter 400 is located in other,
previously described FIGS. Thus FSB 500 has input
(100) and output (510) routing generally similar to the
input and output routing of a previously described multiplier
50 or barrel shifter 400. Other specific examples
of circuitry that FSB 500 can be are (1) a parallel arithmetic
logic unit ("ALU"), (2) a parallel Galois field ("GF")
multiplier, and (3) small multiplier arrays used for SIMD
(single instruction, multiple data) processing. Still other
examples will occur to those skilled in the art.
[0083] FIG. 18 shows a more generalized illustration
of what has been described above. A representative logic
element 620 includes programmable logic (such as a
programmable, four-input look-up table) 622, a register
624 for registering the output of logic 622, a PLC 626
for passing either the unregistered output of logic 622
or the output of register 624, and an output driver 628
for driving the output of PLC 626 out into global routing
resources 630 of PLD 610. Logic element 620 can get
its inputs (or at least some of its primary data inputs) 
from global routing 630 via local routing 640. Local rout- 
ing 640 can be alternatively used to apply the outputs 
of parallel function-specific block 650 (e.g., of any of the 
types mentioned above) to the inputs of logic element 
620. As has already been pointed out, this arrangement 
has such advantages as avoiding the need to provide 
dedicated registers and/or output drivers for FSB 650. 
It also avoids the need to use global routing for the im- 
mediate outputs of FSB 650. 

[0084] FIG. 19 repeats what is shown in FIG. 18 and
adds a generalization of what is shown and described
earlier for the input side of an FSB. In particular, FIG. 19
shows FSB 650 getting its input signals from local interconnection
resources 640" that it shares with other logic
elements 620'. This has the above-described advantages
such as (1) avoiding the need to use global resources
or to provide additional local resources for FSB 650 in- 
put, and (2) inherently giving FSB 650 the high degree 
of input routing flexibility that is already typically provid- 
ed for logic elements such as 620'. 
[0085] Although the earlier-described FIGS. (other
than FIGS. 18 and 19) generally show inclusion of function-specific
blocks at a particular location in the illustrative
PLD architecture, it will be understood that the invention
is equally applicable to other PLD architectures
and to other locations of FSBs in such architectures. 
Other modifications of the principles discussed above 
are also possible within the scope of the invention. 
FIGS. 20 and 21 show some examples of some of these 
variations. 

[0086] In FIG. 20 FSB 730 is basically disposed be- 
tween two regions 30 of programmable logic and the lo- 
cal interconnect 70 serving those two regions. FSB 730 
gets its inputs 720 from the local interconnect 70 of 
those two regions 30. Accordingly, FSB 730 shares this 
local interconnect 70 with those regions 30 and there- 
fore does not need its own additional, dedicated local 
interconnect for input. As in earlier-described embodi- 
ments, this has several advantages such as saving glo- 
bal interconnect, avoiding the need for additional local 
interconnect, and inherently giving FSB 730 the high de- 
gree of input routing flexibility that is typically provided 
in the input routing resources of logic regions 30. 
[0087] On the output side, FSB 730 shares with the
adjacent logic regions 30 the output drivers 138 of the
logic elements 130 that make up those regions. To enable
this to be illustrated, the output drivers 138 of the
depicted logic elements 130 are shown separate from
and outside the logic element boxes. The output signal
137 of the other components of each logic element is
applied to one input of an associated PLC 750. One of
the output signals 740 of FSB 730 is applied to another
input of each PLC 750. Each PLC 750 is programmable
to select either of its inputs for application to an associated
output driver 138. The output 139 of each driver
138 is applied to global interconnect 140. As in the earlier-described
embodiments, this sharing of output drivers
138 with logic elements 130 avoids the need to provide
additional, dedicated output drivers for FSB 730.
Note also that in this embodiment the output drivers 138
that are thus "stolen" for use by FSB 730 are the output
drivers of the same logic elements 130 whose input interconnect
70 is also "stolen" to provide inputs to the
FSB. Thus the impact of using FSB 730 is confined to
a large degree to just these logic elements 130.
[0088] Still another illustrative embodiment of the invention
is shown in representative portion in FIG. 21. In
this embodiment FSB 830 is disposed between two programmable
logic regions 30. FSB 830 gets its input signals
from the circuitry of the logic elements 130 in those
regions 30. For example, the signals on the lead 137
(see FIG. 6) in each of these logic elements 130 may
be used as the inputs to FSB 830. Alternatively, any other
signal associated with these logic elements may be
used as the FSB inputs.

[0089] Each output of FSB 830 is applied to one input 
of a respective one of PLCs 850. Another input to each 
PLC 850 is a signal from a respective one of the logic
elements 130 from which FSB 830 may get an input signal.
(Although FIG. 21 shows the same logic element
signals 137 going to both FSB 830 inputs and PLCs 850,
it will be understood that these could alternatively be different
signals of the logic elements.) Each PLC 850 selects
one of its input signals for application to an associated
output driver 138. As in the embodiment shown
in FIG. 20, drivers 138 can be the output drivers that are
nominally part of the logic elements 130 from which FSB
830 may get its inputs. Drivers 138 drive the signals applied
to them into global interconnect 140 as in the embodiment
shown in FIG. 20.

[0090] From the foregoing it will be seen that in the
embodiment shown in FIG. 21, FSB 830 shares with the
adjacent logic elements 130 the output drivers 138 of
those logic elements. By getting its input signals 137 directly
from the adjacent logic elements 130, FSB 830
also shares with those logic elements other resources
of those logic elements. For example, FSB 830 takes
advantage of the input routing resources 70 and possibly
also the registers 134 of those logic elements, thereby
avoiding the need for additional, separate, dedicated
input routing resources for FSB 830 and/or for additional,
separate, dedicated input registers for the FSB.
[0091] FIG. 22 illustrates a programmable logic device
10/10'/610/610'/710/810 (hereinafter generically
just 10) of this invention in a data processing system
1002. Data processing system 1002 may include one or
more of the following components: a processor 1004;
memory 1006; I/O circuitry 1008; and peripheral devices
1010. These components are coupled together by a system
bus 1020 and are populated on a circuit board 1030
which is contained in an end-user system 1040.
[0092] System 1002 can be used in a wide variety of
applications, such as computer networking, data networking,
instrumentation, video processing, digital signal
processing, or any other application where the advantage
of using programmable or reprogrammable logic
is desirable. Programmable logic device 10 can be
used to perform a variety of different logic functions. For 
example, programmable logic device 10 can be config- 
ured as a processor or controller that works in cooper- 
ation with processor 1004. Programmable logic device 
1 0 may also be used as an arbiter for arbitrating access 
to a shared resource in system 1 002. In yet another ex- 
ample, programmable logic device 10 can be configured 
as an interface between processor 1 004 and one of the 
other components in system 1002. It should be noted 
that system 1 002 is only exemplary, and that the true 
scope and spirit of the invention should be indicated by 
the following claims. 

[0093] Various technologies can be used to implement
programmable logic devices 10 in accordance with this
invention, as well as the various components of those
devices (e.g., the above-described PLCs and the FCEs
that may control the PLCs). For example, each PLC can be
a relatively simple programmable connector such as a
switch or a plurality of switches for connecting any one
of several inputs to an output. Alternatively, each PLC
can be a somewhat more complex element that is capable
of performing logic (e.g., by logically combining
several of its inputs) as well as making a connection.
In the latter case, for example, each PLC can be product
term logic, implementing functions such as AND, NAND,
OR, or NOR. Examples of components suitable for
implementing PLCs are EPROMs, EEPROMs, pass transistors,
transmission gates, antifuses, laser fuses, metal
optional links, etc. As has been mentioned, the various
components of PLCs can be controlled by various,
programmable, function control elements ("FCEs"). (With
certain PLC implementations (e.g., fuses and metal
optional links) separate FCE devices are not required.)
FCEs can also be implemented in any of several different
ways. For example, FCEs can be SRAMs, DRAMs, first-in
first-out ("FIFO") memories, EPROMs, EEPROMs, function
control registers (e.g., as in Wahlstrom U.S. patent
3,473,160), ferro-electric memories, fuses, antifuses,
or the like. From the various examples mentioned above
it will be seen that this invention is applicable to
both one-time-only programmable and reprogrammable
devices.
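
By way of illustration only, the two kinds of PLC described
above can be modeled behaviorally in software. The following
Python sketch is not part of the disclosed circuitry; the
function names plc_mux and plc_product_term, and the
representation of FCE settings as a select index, a use mask,
and an inversion flag, are assumptions introduced solely for
this example.

```python
def plc_mux(inputs, select):
    """Simple PLC: connect one of several inputs to the output.

    `select` stands in for the value held by the controlling FCEs
    (hypothetical encoding: a plain list index).
    """
    return inputs[select]


def plc_product_term(inputs, use_mask, invert_output=False):
    """More complex PLC: logically combine several inputs.

    `use_mask` stands in for FCE bits choosing which inputs take
    part in the AND; `invert_output` turns the AND into a NAND.
    Both names are hypothetical.
    """
    term = all(bool(bit) for bit, used in zip(inputs, use_mask) if used)
    return (not term) if invert_output else term


if __name__ == "__main__":
    signals = [0, 1, 0, 1]
    print(plc_mux(signals, 1))
    print(plc_product_term([1, 1, 0], [True, True, False]))
    print(plc_product_term([1, 1, 0], [True, True, False], invert_output=True))
```

In this model a one-time-only programmable device corresponds
to fixing the select, mask, and inversion arguments once,
while a reprogrammable device corresponds to allowing them to
be changed.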
[0094] It will be understood that the foregoing is only 
illustrative of the principles of the invention, and that var- 
ious modifications can be made by those skilled in the 
art without departing from the scope and spirit of the in- 
vention. For example, the various elements of this in- 
vention can be provided on a PLD in any desired num- 
bers and arrangements. 



Claims 

1. Programmable logic device circuitry comprising:

a plurality of regions of programmable logic circuitry;

general purpose interconnection circuitry programmably
configurable to allow outputs of substantially any of
the regions to be applied to inputs of substantially
any of the regions;

function-specific circuitry; and

routing circuitry programmably configurable to route
FSB outputs of the function-specific circuitry to only
a subset of the regions.

2. The circuitry defined in claim 1 wherein the
function-specific circuitry is selected from the group
consisting of parallel multiplier circuitry, parallel
barrel shifter circuitry, parallel arithmetic logic unit
circuitry, parallel galois field multiplier circuitry,
and parallel multiplier array circuitry for SIMD
processing.

3. The circuitry defined in claim 1 wherein the routing
circuitry is adapted to route FSB outputs for output
driving by output driver circuitry of the regions in the
subset.

4. The circuitry defined in claim 3 wherein the output
driver circuitry of the regions in the subset is adapted
to drive signals into the general purpose
interconnection circuitry.

5. The circuitry defined in claim 1 wherein the routing
circuitry is adapted to route FSB outputs for
registering by register circuitry of the regions in the
subset.

6. The circuitry defined in claim 1 wherein the routing
circuitry is adapted to route FSB outputs for
processing by programmable logic circuitry of the
regions in the subset.

7. The circuitry defined in claim 1 further comprising:

input routing circuitry adapted to selectively divert
inputs of a subplurality of the regions to inputs of
the function-specific circuitry.

8. The circuitry defined in claim 1 further comprising:

input routing circuitry adapted to apply signals from
circuitry in a subplurality of the regions to inputs of
the function-specific circuitry.

9. A programmable logic device comprising:

a plurality of regions of programmable logic, each of
which is programmable to perform any of a plurality of
logic functions on logic region input signals applied
to that logic region in order to produce a logic region
output signal of that logic region;

programmable general purpose interconnection circuitry
adapted to programmably selectively apply substantially
any of the logic region output signals to substantially
any of the logic regions as a logic region input signal
of the last-mentioned logic region;

function-specific circuitry adapted to perform a
particular type of tasks on a plurality of FSB input
signals applied to the function-specific circuitry in
parallel to produce a plurality of FSB output signals
output by the function-specific circuitry in parallel;
and

special purpose routing circuitry adapted to
selectively route the FSB output signals to only a
subset of the logic regions for use by circuitry of the
logic regions in that subset.

10. The device defined in claim 9 wherein the function- 
specific circuitry is selected from the group consist- 
ing of multiplier circuitry, barrel shifter circuitry, arith- 
metic logic unit circuitry, galois field multiplier cir- 
cuitry, and multiplier array circuitry for SIMD 
processing. 

11. The device defined in claim 9 wherein the circuitry 
of the logic regions in the subset that can be used 
by the FSB output signals, routed by the special pur- 
pose routing circuitry, comprises output driver cir- 
cuitry of those logic regions. 

12. The device defined in claim 11 wherein the output 
driver circuitry is adapted to apply output signals to 
the general purpose interconnection circuitry. 

13. The device defined in claim 9 wherein the circuitry 
of the logic regions in the subset that can be used 
by the FSB output signals, routed by the special pur- 
pose routing circuitry, comprises register circuitry of 
those logic regions. 

14. The device defined in claim 9 wherein the circuitry 
of the logic regions in the subset that can be used 
by the FSB output signals, routed by the special pur- 
pose routing circuitry, comprises programmable 
logic circuitry of those logic regions. 

15. The device defined in claim 9 wherein the FSB input
signals comprise logic region input signals of a
subplurality of the logic regions.

16. The device defined in claim 15 wherein the logic
regions in the subplurality are mutually exclusive of
the logic regions in the subset.

17. The device defined in claim 15 wherein the logic
regions in the subplurality are at least partly logic
regions in the subset.

18. The device defined in claim 15 wherein the logic
regions in the subplurality are the logic regions in the
subset.

19. The device defined in claim 9 wherein the FSB input
signals comprise signals taken from circuitry in a
subplurality of the logic regions.

20. The device defined in claim 19 wherein the circuitry
of the subplurality of logic regions from which the
FSB input signals are taken comprises register
circuitry of those logic regions.

21. The device defined in claim 19 wherein the circuitry
of the subplurality of logic regions from which the
FSB input signals are taken comprises programmable
logic circuitry of those logic regions.

22. A digital processing system comprising:

processing circuitry;

a memory coupled to said processing circuitry; and

a programmable logic device as defined in claim 9
coupled to the processing circuitry and the memory.

23. A printed circuit board on which is mounted a
programmable logic device as defined in claim 9.

24. The printed circuit board defined in claim 23 further
comprising:

a memory mounted on the printed circuit board and
coupled to the programmable logic device.

25. The printed circuit board defined in claim 23 further
comprising:

processing circuitry mounted on the printed circuit
board and coupled to the programmable logic device.

26. Programmable logic device circuitry comprising:

a plurality of regions of programmable logic, each of
which is programmable to perform any of a plurality of
logic functions on logic region input signals applied
to that logic region in order to produce a logic region
output signal of that logic region;

programmable general purpose interconnection circuitry
adapted to programmably selectively apply substantially
any of the logic region output signals to substantially
any of the logic regions as logic region input signals
of the last-mentioned logic regions;

parallel multiplier circuitry adapted to perform
substantially parallel multiplication on a plurality of
multiplier input signals applied to the multiplier
circuitry in parallel in order to produce a plurality
of multiplier output signals output by the multiplier
circuitry in parallel; and

special purpose routing circuitry adapted to
selectively route the multiplier output signals to only
a subset of the logic regions for use by circuitry of
the logic regions in that subset.

27. The circuitry defined in claim 26 wherein the circuit- 
ry of the logic regions in the subset that can be used 
by the multiplier output signals, routed by the spe- 
cial purpose routing circuitry, comprises output driv- 
er circuitry of those logic regions. 

28. The circuitry defined in claim 27 wherein the output 
driver circuitry is adapted to apply signals to the 
general purpose interconnection circuitry. 

29. The circuitry defined in claim 26 wherein the circuit- 
ry of the logic regions in the subset that can be used 
by the multiplier output signals, routed by the spe- 
cial purpose routing circuitry, comprises register cir- 
cuitry of those logic regions. 

30. The circuitry defined in claim 26 wherein the circuit- 
ry of the logic regions in the subset that can be used 
by the multiplier output signals, routed by the spe- 
cial purpose routing circuitry, comprises program- 
mable logic circuitry of those logic regions. 

31. The circuitry defined in claim 26 wherein the
circuitry of the logic regions in the subset that can be
used by the multiplier output signals, routed by the
special purpose routing circuitry, comprises accumulator
circuitry adapted to arithmetically accumulate
successive values respectively represented by the
multiplier output signals in successive time intervals.

32. The circuitry defined in claim 31 wherein the
accumulator circuitry is adapted to employ a selectable
one of addition and subtraction to arithmetically
accumulate the successive values.

33. The circuitry defined in claim 31 wherein the
accumulator circuitry is adapted to arithmetically
accumulate by subtracting each successive value from
a previous accumulated value, and wherein the
accumulator circuitry comprises first one's complement
circuitry adapted to one's-complement the accumulated
value, adder circuitry adapted to add a current one of
the successive values to outputs of the first one's
complement circuitry, and second one's complement
circuitry adapted to one's-complement outputs of the
adder circuitry to produce a next accumulated value.

34. The circuitry defined in claim 31 further comprising
circuitry adapted to sign-extend the successive values
prior to their use in the accumulator circuitry.

35. The circuitry defined in claim 26 wherein the
multiplier input signals comprise logic region input
signals of a subplurality of the logic regions.

36. The circuitry defined in claim 35 wherein the logic
regions in the subplurality are mutually exclusive of
the logic regions in the subset.

37. The circuitry defined in claim 35 wherein the logic
regions in the subplurality are at least partly logic
regions in the subset.

38. The circuitry defined in claim 35 wherein the logic
regions in the subplurality are the logic regions in
the subset.

39. The circuitry defined in claim 26 wherein the
multiplier input signals comprise signals taken from
circuitry in a subplurality of the logic regions.

40. The circuitry defined in claim 39 wherein the
circuitry of the subplurality of logic regions from
which the multiplier input signals are taken comprises
register circuitry of those logic regions.

41. The circuitry defined in claim 39 wherein the
circuitry of the subplurality of logic regions from
which the multiplier input signals are taken comprises
programmable logic circuitry of those logic regions.

42. The circuitry defined in claim 26 further comprising:

second parallel multiplier circuitry adapted to
perform substantially parallel multiplication on a
plurality of second multiplier input signals applied to
the second multiplier circuitry in parallel in order to
produce a plurality of second multiplier output signals
output by the second multiplier circuitry in parallel;
and

second special purpose routing circuitry adapted to
selectively route the second multiplier output signals
to only a subplurality of the logic regions for use by
circuitry of the logic regions in that subplurality.

43. The circuitry defined in claim 42 wherein at least
some of the logic regions are common logic regions
to both the subset and the subplurality.

44. The circuitry defined in claim 43 wherein the common
logic regions are configurable to arithmetically
combine the multiplier output signals and the second
multiplier output signals.

45. The circuitry defined in claim 44 wherein the common
logic regions are further configurable to register
signals that result from arithmetically combining the
multiplier output signals and the second multiplier
output signals.

46. The circuitry defined in claim 44 wherein the common
logic regions comprise output driver circuitry usable
for driving signals that result from arithmetically
combining the multiplier output signals and the second
multiplier output signals out into the general purpose
interconnection circuitry.

47. Circuitry for progressively arithmetically
accumulating a succession of arithmetic values
respectively represented by successive digital input
signals to produce successive digital output signals
respectively indicative of successive accumulated values
comprising:

first one's complement circuitry adapted to
one's-complement each successive accumulated value;

adder circuitry adapted to successively add each
successive arithmetic value to concurrent outputs of
the first one's complement circuitry; and

second one's complement circuitry adapted to
one's-complement outputs of the adder circuitry to
produce a next successive accumulated value.

48. The circuitry defined in claim 47 further comprising:

sign extension circuitry adapted to sign-extend each
successive arithmetic value.

49. The circuitry defined in claim 47 further comprising:

alternative circuitry adapted to selectively cause the
first and second one's complement circuitry to pass
values without one's complementing them.

50. The circuitry defined in claim 49 further comprising:

programmable circuitry adapted to control whether the
alternative circuitry is operative.

51. The circuitry defined in claim 49 further comprising:

control circuitry adapted to apply a time-varying
signal to the alternative circuitry so that the
alternative circuitry is operative only at certain
times.
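
For illustration only, and forming no part of the claims, the
arithmetic recited in claims 33 and 47 to 49 can be checked
with a short behavioral model. For an N-bit value x, the
one's complement is (2^N - 1) - x, so one's-complementing the
sum of a value and a one's-complemented accumulated value
yields the accumulated value minus that value, modulo 2^N.
The Python sketch below assumes a fixed word width and
unsigned wraparound arithmetic; the names ones_complement,
sign_extend, and accumulate_step, and the width and subtract
parameters, are hypothetical and introduced only for this
example.

```python
def ones_complement(x, width):
    """Invert every bit of a `width`-bit value."""
    mask = (1 << width) - 1
    return ~x & mask


def sign_extend(value, from_bits, to_bits):
    """Sign-extend a `from_bits`-wide value to `to_bits` bits."""
    value &= (1 << from_bits) - 1
    if (value >> (from_bits - 1)) & 1:
        # Fill the upper bits with copies of the sign bit.
        value |= ((1 << to_bits) - 1) & ~((1 << from_bits) - 1)
    return value


def accumulate_step(acc, value, width, subtract):
    """One accumulation step on `width`-bit words.

    When `subtract` is true both one's complement stages are
    active, giving (acc - value) mod 2**width. When it is false
    both stages pass their inputs through unchanged, giving
    (acc + value) mod 2**width.
    """
    mask = (1 << width) - 1
    first = ones_complement(acc, width) if subtract else acc & mask
    total = (first + value) & mask  # adder stage
    return ones_complement(total, width) if subtract else total


if __name__ == "__main__":
    acc = 10
    acc = accumulate_step(acc, 3, 8, subtract=True)
    print(acc)
    acc = accumulate_step(acc, 3, 8, subtract=False)
    print(acc)
```

Modeling the bypass as a single subtract flag is a
simplification; in this model a flag fixed at configuration
time corresponds to programmable control of the alternative
circuitry, and a flag that changes from step to step
corresponds to time-varying control.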